Kernel Machine Based Feature Extraction Algorithms for Regression Problems
Authors
Abstract
In this paper we consider two novel kernel machine based feature extraction algorithms in a regression setting. The first method is derived from the principles underlying the recently introduced Maximum Margin Discrimination Analysis (MMDA) algorithm. Here, however, it is shown that the orthogonalization principle employed by the original MMDA algorithm can be motivated using the well-known ambiguity decomposition, thus providing a firm ground for the good performance of the algorithm. The second algorithm combines kernel machines with average derivative estimation and is derived from the assumption that the true regressor function depends only on a subspace of the original input space. The proposed algorithms are evaluated in preliminary experiments conducted with artificial and real datasets.

1 FEATURE EXTRACTION BASED ON AMBIGUITY DECOMPOSITION

In this article we consider regression problems, where the data (X_i, Y_i) are independent, identically distributed random variables, L is a loss function, e.g. the quadratic loss L(y, z) = (y − z)^2, and we seek to determine the regressor f(x) = argmin_y E[L(Y, y) | X = x]. Let us first consider the model Y = ∑_i β_i g_i(X) + ε, where g_i : X → R are unknown functions and ε is a noise variable independent of X. We shall consider estimating the g_i by means of an iterative procedure. One view of the model is then to treat Y = β^T γ + ε as a linear regression problem, where γ = (g_1(X), . . . , g_m(X))^T.

1.1 Ambiguity decomposition

In this section we shall assume that the vector β satisfies 0 ≤ β_i ≤ 1 and β^T e = 1, where e = (1, 1, . . . , 1)^T, i.e., the output can be obtained as a noisy convex combination of the 'features' g_1(X), . . . , g_m(X). We shall further assume that the loss function is the quadratic loss. Let g = ∑_i β_i g_i and let f be arbitrary, where Loss(h) = E[(h(X) − f(X))^2] for any function h. Then it is not hard to see that

Loss(g) = ∑_i β_i Loss(g_i) − ∑_i β_i E[(g_i(X) − g(X))^2].

This formula, first given in [2], is called the "ambiguity decomposition" (AD). The ensemble loss can be decreased if the ambiguity of the ensemble (the second term) is maximized whilst keeping the losses of the individual members low. Now, we easily obtain

∑_i β_i E[(g_i(X) − g(X))^2] = ∑_i (β_i − β_i^2) E[g_i(X)^2] − ∑_{i≠j} β_i β_j E[g_i(X) g_j(X)].

1 Computer and Automation Research Institute of the Hungarian Academy of Sciences, Budapest, Hungary, email: [email protected]
2 Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University of Szeged, Szeged, Hungary, email: {kocsor,kkornel}@inf.u-szeged.hu
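One way to verify the ambiguity decomposition, sketched in LaTeX under the stated assumptions (quadratic loss, ∑_i β_i = 1, g = ∑_i β_i g_i):

\begin{align*}
\sum_i \beta_i \operatorname{Loss}(g_i) - \operatorname{Loss}(g)
  &= \sum_i \beta_i E\!\left[(g_i(X) - f(X))^2\right] - E\!\left[(g(X) - f(X))^2\right] \\
  &= \sum_i \beta_i E\!\left[g_i(X)^2\right] - E\!\left[g(X)^2\right] \\
  &= \sum_i \beta_i E\!\left[(g_i(X) - g(X))^2\right].
\end{align*}

The cross terms involving f cancel in the second step because ∑_i β_i = 1 and ∑_i β_i g_i = g; the last step uses the same two facts, since ∑_i β_i E[(g_i − g)^2] = ∑_i β_i E[g_i^2] − 2E[g^2] + E[g^2]. Rearranging gives the decomposition stated above.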
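As a quick numerical illustration (not from the paper; the sine/cosine features, the target f, and all variable names below are arbitrary choices), the decomposition can be checked with a short Python sketch:

import numpy as np

rng = np.random.default_rng(0)
n, m = 10_000, 5                       # sample size and number of features g_i

X = rng.normal(size=n)
# Arbitrary illustrative 'features' g_i(X) and an arbitrary target f(X).
G = np.column_stack([np.sin((k + 1) * X) for k in range(m)])   # G[:, i] = g_i(X)
f = np.cos(X)

beta = rng.random(m)
beta /= beta.sum()                     # 0 <= beta_i <= 1, sum_i beta_i = 1

g = G @ beta                           # convex combination g = sum_i beta_i g_i

def loss(h):
    """Empirical version of Loss(h) = E[(h(X) - f(X))^2]."""
    return np.mean((h - f) ** 2)

member_losses = np.array([loss(G[:, i]) for i in range(m)])
ambiguity = np.mean((G - g[:, None]) ** 2, axis=0)   # E[(g_i(X) - g(X))^2]

lhs = loss(g)
rhs = beta @ member_losses - beta @ ambiguity
print(f"Loss(g) = {lhs:.10f}, decomposition = {rhs:.10f}")

Because the decomposition is a pointwise algebraic identity, the two printed numbers agree up to floating-point rounding for any choice of f, of the g_i, and of the convex weights β.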
Similar Resources
Comparative Analysis of Machine Learning Algorithms with Optimization Purposes
The fields of optimization and machine learning are increasingly intertwined, and optimization in different problems leads to the use of machine learning approaches. Machine learning algorithms run in reasonable computational time for specific classes of problems and play an important role in extracting knowledge from large amounts of data. In this paper, a methodology has been employed to opt...
Automatic Feature Selection via Weighted Kernels and Regularization
Selecting important features in non-linear kernel spaces is a difficult challenge in both classification and regression problems. We propose to achieve feature selection by optimizing a simple criterion: a feature-regularized loss function. Features within the kernel are weighted, and a lasso penalty is placed on these weights to encourage sparsity. We minimize this feature-regularized loss fun...
KNIFE: Kernel Iterative Feature Extraction
Selecting important features in non-linear or kernel spaces is a difficult challenge in both classification and regression problems. When many of the features are irrelevant, kernel methods such as the support vector machine and kernel ridge regression can sometimes perform poorly. We propose weighting the features within a kernel with a sparse set of weights that are estimated in conjunction w...
Kernel-based Fuzzy Feature Extraction Method and Its Application to Face Image Classification
The Hughes phenomenon (or the curse of dimensionality) suggests two essential directions for improving classification performance on high-dimensional, small sample size (SSS) problems. One is to reduce the dimensionality of the data by feature extraction or feature selection methods. The other is to increase the training sample size. In recent years some kernel-based feature extraction ...
Development of a Pharmacogenomics Model based on Support Vector Regression with Optimal Features Selection Approach to Determine the Initial Therapeutic Dose of Warfarin Anticoagulant Drug
Introduction: The use of artificial intelligence tools in pharmacogenomics is one of the newest fields of bioinformatics research. One of the most important drugs whose initial therapeutic dose is difficult to determine is the anticoagulant warfarin. Warfarin is an oral anticoagulant for which, due to its narrow therapeutic window and the complex interrelationships of individual factors, the selection of its ...
Journal:
Volume / Issue:
Pages: -
Publication date: 2004